Parallel Active Learning: Eliminating Wait Time with Minimal Staleness

Authors

  • Robbie Haertel
  • Paul Felt
  • Eric K. Ringger
  • Kevin Seppi
Abstract

A practical concern for Active Learning (AL) is the amount of time human experts must wait for the next instance to label. We propose a method for eliminating this wait time, independent of the specific learning and scoring algorithms, by making scores always available for all instances and using old (stale) scores when necessary. The time during which the expert is annotating is used to train models and score instances, in parallel, to maximize the recency of the scores. Our method can be seen as a parameterless, dynamic batch AL algorithm. We analyze the amount of staleness introduced by various AL schemes and then examine the effect of that staleness on performance in a part-of-speech tagging task on the Wall Street Journal. Empirically, the parallel AL algorithm effectively has a batch size of one and a large candidate set size, but it eliminates the time an annotator would have to wait for a similarly parameterized batch scheme to select instances. The exact performance of our method on other tasks will depend on the relative ratios of time spent annotating, training, and scoring, but in general we expect our parameterless method to perform favorably compared to batch AL when wait time is taken into account.
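The dependence of staleness on the ratio of annotation time to training and scoring time can be illustrated with a small timing model. The sketch below is not the authors' implementation; it assumes fixed integer durations, an annotator who never waits, and background train-and-score jobs that run back to back, each using the labels available when it starts. The function name `staleness_per_request` and both time parameters are hypothetical. Staleness here means the number of labels collected so far that the model behind the currently served scores has not seen.

```python
def staleness_per_request(n_requests, annotate_time, train_score_time):
    """Return the staleness of the scores served at each annotator request.

    Illustrative assumptions (not from the paper): durations are fixed
    positive integers, the annotator never waits, and background
    train+score jobs run back to back, each trained on the labels that
    exist at the moment the job starts.
    """
    if annotate_time <= 0 or train_score_time <= 0:
        raise ValueError("times must be positive")

    staleness = []
    for k in range(n_requests):
        # Request k arrives right after the k-th label is finished.
        now = k * annotate_time
        # Number of background jobs whose scores have been published by now.
        finished_jobs = now // train_score_time
        if finished_jobs == 0:
            # Only the initial scores exist; they reflect zero labels.
            trained_on = 0
        else:
            # The most recent finished job started one job-length earlier
            # and saw the labels completed by its start time.
            job_start = (finished_jobs - 1) * train_score_time
            trained_on = job_start // annotate_time
        staleness.append(k - trained_on)
    return staleness


if __name__ == "__main__":
    # Annotation slower than training+scoring: scores lag by one label.
    print(staleness_per_request(4, annotate_time=10, train_score_time=3))
    # Annotation faster than training+scoring: scores lag by several labels.
    print(staleness_per_request(11, annotate_time=3, train_score_time=10))
```

Under these assumptions, when annotating takes longer than training and scoring, the served scores lag by a single label, which matches the abstract's description of an effective batch size of one. When annotating is faster, the lag grows to several labels but the annotator still never waits.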

Similar resources

Probabilistically Bounded Staleness for Practical Partial Quorums

Modern storage systems employing quorum replication are often configured to use partial, non-strict quorums. These systems wait only for a subset of their replicas to respond to a request before returning an answer, without guaranteeing that read and write replica sets intersect. While these partial quorum mechanisms provide only basic eventual consistency guarantees, with no limit to the recen...

Teacher Wait-Time and Learner Initiation: A Single Case Analysis

The prevailing pattern of classroom interaction is a tripartite exchange structure known as IRF (teacher initiation, student response, teacher follow-up/feedback; Sinclair & Coulthard, 1975). Although it has its own contributions to classroom discourse, it has been criticized on several grounds, particularly for affording minimum learner participation opportunities (Kasper, 2001). An alternativ...

Teachers’ Limited Wait-Time Practice and Learners’ Participation opportunities in EFL Classroom Interaction

Pairing theory with methodology, this study demonstrates how EFL teachers’ limited wait-time practice structures in and affects the structuring of the unfolding classroom discourse with reference to learners’ participation opportunities. Informed by the tenets of conversation analysis, we have observed, videotaped, and transcribed line-by-line 10 EFL teachers’ naturally-occurring classroom inte...

Exploiting Bounded Staleness to Speed Up Big Data Analytics

Many modern machine learning (ML) algorithms are iterative, converging on a final solution via many iterations over the input data. This paper explores approaches to exploiting these algorithms’ convergent nature to improve performance, by allowing parallel and distributed threads to use loose consistency models for shared algorithm state. Specifically, we focus on bounded staleness, in which e...

Solving Re-entrant No-wait Flexible Flowshop Scheduling Problem; Using the Bottleneck-based Heuristic and Genetic Algorithm

In this paper, we study the re-entrant no-wait flexible flowshop scheduling problem with makespan minimization objective and then consider two parallel machines for each stage. The main characteristic of a re-entrant environment is that at least one job is likely to visit certain stages more than once during the process. The no-wait property describes a situation in which every job has its own ...

Publication date: 2010